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ABSTRACT 

We present a new Monte-Carlo algorithm to generate merger trees describing the for- 
mation history of dark matter halos. The algorithm is a modification of the algorithm 
of Cole et al. (2000) used in the GALFORM semi-analytic galaxy formation model. 
As such, it is based on the Extended Press-Schechter theory and so should be ap- 
plicable to hierarchical models with a wide range of power spectra and cosmological 
models. It is tuned to be in accurate agreement with the conditional mass functions 
found in the analysis of merger trees extracted from the ACDM Millennium N-body 
simulation. We present a comparison of its predictions not only with these conditional 
mass functions, but also with additional statistics of the Millennium Simulation halo 
merger histories. In all cases we find it to be in good agreement with the Millennium 
Simulation and thus it should prove to be a very useful tool for semi-analytic models 
of galaxy formation and for modelling hierarchical structure formation in general. We 
have made our merger tree generation code and code to navigate the trees available 
at http://star-www.dur.ac.uk/~cole/merger_trees . 
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' 1 INTRODUCTION 

In hierarchical models of structure formation, such as 
■ ACDM, the formation of a dark matter (DM) halo through 
accretion and repeated mergers can be described by a merger 
, tree (Lacey & Cole 1993). The merger trees, which list the 
' progenitors of a given halo at a series of redshifts and de- 
scribe the sequence in which they merge together, contain 
essentially all the information one needs about the DM 
when building models of the other processes involved in 
galaxy formation. Thus, the merger trees, whether extracted 
from N-body simulations such as the Millennium Simulation 
(Springel et al. 2005) or generated by Monte-Carlo (MC) 
algorithms (e.g. Sheth & Lemson 1999; Somerville & Ko- 
latt 1999; Cole et al. 2000), provide the framework within 
which one can model the additional astrophysical processes 
of galaxy formation (Cole et al. 2000; Kauffmann & White 
1993; Somerville & Kolatt 1999). 

The statistical properties of Monte-Carlo merger trees 
based on the approximate Extended Press-Schechter (EPS) 
theory (Bond et al. 1991; Bower 1991; Lacey & Cole 1993) 
are not in perfect agreement with those built from high res- 
olution, highly non-linear N-body simulations (e.g. Cole et 
al. 2007). We have found that when the same semi-analytic 
galaxy formation model is run first with MC trees and 
then with N-body trees there can be significant differences 
in the properties of their resulting galaxy populations. In 
some ways these differences are minor as small changes in 
the uncertain parameters of the star formation and feed- 



back prescriptions can often bring them back into align- 
ment. However, it would be far better if MC and N-body 
merger trees were in much better agreement. For instance, 
this would allow galaxy formation models to be run first on 
MC trees and parameters including cosmological parameters 
tuned to match observed galaxy properties in advance of 
running an expensive N-body simulation which will furnish 
the positional information needed to make galaxy cluster- 
ing predictions. Additionally, as N-body simulations always 
have poorer mass resolution than can be obtained with MC 
merger trees, one would like to be able to use MC trees of 
varying resolution to assess the impact of the limited resolu- 
tion of the N-body simulation. This can be hard to achieve 
when the two sets of trees differ systematically. 



In this paper we present a modification to the MC 
merger tree algorithm of Cole et al. (2000) that we tune to 
be in accurate agreement with the statistical properties of 
the Millennium Simulation merger trees that were presented 
in Cole et al. (2007) . In Section 2 we describe both the origi- 
nal EPS MC algorithm as implemented in Cole et al. (2000) 
and our modification. Section 3 compares the results of our 
modified algorithm with the original and with the statistics 
of merger trees from the Millennium Simulation. We briefiy 
discuss relationship of our algorithm to other models in Sec- 
tion 4 and conclude in Section 5. 
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2 THE MONTE-CARLO ALGORITHM 

In the following sections, wc briefly review the Monte-Carlo 
algorithm implemented in the GALFORM semi-analytic 
code (Cole et al. 2000) and then describe how we modify 
it to achieve more accurate agreement with simulation data. 



if R < P, then wc generate a random value of Mi in the 
range Mrcs > Mi > M'z/'l, consistent with the distribution 
given by equation (3), to produce two new halos with masses 
Ml and M2(l — F) — Mi. The same process is repeated 
on each new halo at successive rcdshift steps to build up a 
complete tree. More details are given in Appendix A. 



2.1 The GALFORM Algorithm 

The merger tree algorithm employed in the GALFORM 
semi-analytic model uses as its starting point the conditional 
mass function 
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given by extended Press-Schechtcr theory (Bond et al. 1991; 
Bower 1991; Lacey & Cole 1993). Here /(M1IM2) represents 
the fraction of mass from halos of mass M2 at redshift Z2 that 
is contained in progenitor halos of mass Mi at an earlier rcd- 
shift zi. The linear density thresholds for collapse at these 
two redshifts are 5i and 82 (e.g. Eke et al. 1996) . The rms 
linear density fluctuation extrapolated to « = in spheres 
containing mass M is denoted cr(M) with ai = a (Mi) and 
0-2 = cr(M2). Taking the limit of /(M1IM2) as «i — > a2 one 
finds, 
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which implies that the mean number of halos of mass Mi 
into which a halo of mass M2 splits when one takes a step 
d^i up in redshift is 



dN 



df M2 



dzi (Ml < M2). 



(3) 
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Then, by specifying a required mass resolution, Mres, for the 
algorithm on can integrate to determine 
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dMi 



dMi, 
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which is the mean number of progenitors with masses Mi in 
the interval M^b < Mi < M2/2 and 



F = 



dN Ml 



dMi M2 



dMi, 



(5) 



which is the fraction of mass of the final object in progenitors 
below this resolution limit. Note that both these quantities 
are proportional to the redshift step, d^i, by virtue of equa- 
tion (3) 

The GALFORM merger tree algorithm then proceeds 
as follows. Firstly, choose a mass and rcdshift, z, for the 
final halo in the merger tree. Then, pick a rcdshift step, 
d^i, such that P <^ 1, to ensure that the halo is unlikely 
to have more than two progenitors at the earlier redshift 
z + dz. Next, generate a uniform random number, R, in the 
interval to 1. If J? > P, then the main halo is not split 
at this step. We simply reduce its mass to M2(l — F) to 
account for mass accreted in unresolved halos. Alternatively 



2.2 The Modified Algorithm 

The binary merger algorithm described above fully respects 
a natural symmetry that whenever one fragment has mass 
Ml the other must have mass M2 — A4i (at least in the limit 
of Mrcs 0). This means that it is not consistent with EPS 
theory as equation (3) does not satisfy this symmetry of re- 
maining unchanged when Mi — » M2 — Mi (Lacey & Cole 
1993; Benson et al. 2005). To force the required symmetry 
the algorithm only uses equation (3) for Mi < M2/2 and 
ignores its predictions for Mi > M2/2. The algorithm is 
also unsatisfactory because the EPS conditional mass func- 
tions, and also the original Press-Schechter mass function, 
do not accurately match what is found in N-body simula- 
tions (Sheth & Tormen 1999; Jenkins et al. 2001; Cole et al. 
2007). However, many statistical properties of the merger 
trees produced by the above algorithm have trends with 
mass and redshift that agrees well with those of merger 
trees constructed from high resolution N-body simulations, 
but with increasing redshift they systematically underes- 
timate the mass of the most massive progenitors (Cole et 
al. 2007). (In practice, the problem is ameliorated in GAL- 
FORM by starting the merger tree construction at higher 
rcdshift.) Here, our aim is to reduce these systematic differ- 
ences. Given the simplicity and zeroth order success of the 
original GALFORM algorithm, it seems reasonable to try 
modifying it by perturbing the basic function that drives 
the algorithm. Namely we consider replacing the function 
defined in equation (3) by making the substitution 



dN 
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G{ai/a2, 62/(72). 



(6) 



Here G{gi/ 02,82/(72) is the "perturbing" function which we 
expect to be of order unity for most of the range of in- 
terest. The choice that the function G should only depend 
on the ratios 01/02 and 82/02 is motivated by the desire 
that the algorithm should preserve self-similarity if used in 
a flat firr, = 1 cosmology with scale free initial conditions 
(e.g. see Efstathiou et al. 1988). The dependence on 82/02 
allows the halo splitting rate to be modified as a function 
of M2/M,, where the characteristic non-linear mass, M,, 
is defined by a(M,) = 8, while the dependence on 01/02 
allows the mass distribution of the resulting fragments to 
be modified. Restricting the dependence of the function to 
only these parameters is necessary to preserve self-similarity, 
but on its own does not guarantee self-similarity. The addi- 
tional unwanted freedom wo hope to remove by fitting to the 
statistical properties of the Millennium Simulation merger 
trees, as presented in Cole et al. (2007). Note that since the 
merger tree algorithm described above only makes use of 
equation (6) for progenitor masses Mi < M2/2 it is only the 
behaviour of G(ai/CT2,(J2/o"2) for Mi < Af2/2 (cti > 0-2) that 
is constrained by comparison to the Millennium Simulation 
merger trees. Consequently the predictions of equation (6) 
for Ml > M2/2 are of no relevance. 
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Figure 1. The fraction of mass in progenitor halos of mass Mi in bins of logjQ M1/M2 at redshifts zi = 0.5, 1, 2 and 4 as indicated, for 
three different masses M2 (indicated at the top of each column). The histograms show the results from the Millennium Simulation while 
the dotted and dashed curves are the corresponding conditional mass functions given by the original GALFORM Monte-Carlo algorithm 
and our new modified algorithm respectively. The solid curve shows an analytic fit to the whole set of conditional mass functions as 
described in Cole et al. (2007). The vertical dotted line indicates the 20 particle mass resolution of the Millennium simulation. 
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To simplify the problem still further we make the as- 
sumption that 

G{a-,/a2,d2/a2) = Go (^)'" (^)^', (7) 

which can be considered as a first order Taylor scries approx- 
imation for InG in terms of ln{ai/a2) and ln((52/a"2)- This 
functional form is particularly convenient. The two terms Go 
and (62/0-2)''^ have no dependence on Mi and so just enter 
the integrals in equations (4) and (5) as multiplicative con- 
stants. The term (a"i/a2)''^ alters the distribution dN/dMi 
and the integrands in both (4) and (5), but has simple ana- 
lytic properties that allow a very fast implementation of the 
splitting algorithm (see Appendix A). 



3 COMPARISON WITH THE MILLENNIUM 
SIMULATION 

The Millennium Sirrmlation (MS, Springcl ct al. 2005) is, to 
date, the largest N-body simulation of a cosmologically rep- 
resentative volume. It uses N = 2160^ particles in a comov- 
ing cube of side L = 500h~^ Mpc to follow the non- linear 
gravitational evolution of a Gaussian random density field 
drawn from a power spectrum consistent with cosmological 
constraints from 2dFGRS (Percival et al. 2001) and the first 
year WMAP data (Spergel et al. 2003). The cosmological 
density parameters arc flm = 0.25, Qb ~ 0.045 and = 
0.75, the Hubble parameter h = Ho/100 km s"^ Mpc"^ = 
0.73 and the linear amplitude of the density fluctuations in 
spheres of radius 8h~^ Mpc is as = 0.9. At each of over 60 
output times a catalogue of friends-of-friends (Davis et al. 
1985) groups was constructed and the descendant of each 
group found at the subsequent timestep. Details of the con- 
struction of these merger trees and and their statistical prop- 
erties can be found in Cole et al. (2007). Below we compare 
a variety of the statistics they estimated with the results of 
the original Cole et al. (2000) GALFORM MC algorithm 
and our new modified algorithm. 

3.1 Conditional Mass Functions 

In Fig. 1 we compare the conditional mass functions of the 
MS merger trees with those of the MC algorithms. Here 
for halos of various masses M2 at redshift 22 = we find 
what fraction of their mass is in progenitor halos of mass 
Ml at various earlier redshifts zi. The histograms show the 
results of the MS while the solid curves show an analytic 
fit described in Cole et al. (2007). The results of the origi- 
nal GALFORM algorithm are shown by the dotted curves. 
As has been noted by Cole et al. (2007) these conditional 
mass functions evolve more rapidly than those of the sim- 
ulation. Thus, the "GALFORM 2000" algorithm strongly 
underpredicts the number of high mass progenitors at high 
redshift. That the EPS theory gives predictions that evolve 
more rapidly with redshift than is found in N-body simu- 
lations has been noted previously (e.g van den Bosch 2002; 
Wechsler et al. 2002; Lin et al. 2003). GiocoU et al. (2007) 
have shown that average halo formation times agree bet- 
ter with the elliptical collapse model of Sheth et al. (2001) 
than with the spherical collapse EPS formalism. Further- 
more, Giocoli et al. (2007) find that scaling the time variable 



in EPS theory by the factor ^ = ^/0l07 = 0.84 that comes 
from fitting the elliptical collapse model to N-body data re- 
sults in formation time predictions that better match the 
N-body data for a wide range of final masses and redshifts. 
By reference to equation (3) it can be seen that our factor 
G(ai/ff2, 52/172) of equation (6) can be viewed as a mod- 
ification to the timestep dzi. Thus, the elliptical collapse 
modelling of Giocoli et al. (2007) suggests that we should 
find G(a\ 1 02,^2! 02) ~ 0.84. In fact, we expect a somewhat 
lower value as the Monte Carlo trees of the original GAL- 
FORM algorithm evolve even more rapdily than the analytic 
predictions of the EPS formalism (Cole et al. 2007). 

To find the best fit parameters we have minimised the 
rms difference 

CTcmf = ((logio/cm'(Ml|M2) -logio/c';S(Ml|M2))'y^'(8) 

between MS data and the results of the MC algorithm over 
all twelve panels in Fig. 1. With the exception of the lowest 
two mass bins plotted in each panel, which were discarded as 
they are influenced by the mass resolution of the MS, equal 
weight was given to each bin. Initially we kept 71 = 72 = 
fixed and allowed only Go to vary. For Go = 1, the original 
GALFORM algorithm, the rms difference (Tcmf = 0.27. The 
best fitting value of Go is 0.79, as anticipated, somewhat 
smaller than the factor 0.84 from Giocoli et al. (2007), and 
this reduces the rms difference significantly to cTcmf = 0.12. 
However, this is a compromise and the data from the dif- 
ferent panels of Fig. 1 prefer different values of Go. This 
can be accommodated by allowing 71 and 72 to vary. As 
52/(72 is an increasing function of the final mass M2, a pos- 
itive 72 would give a relatively higher merger rate for high 
mass, M > M, , halos (where the character mass, M*, has 
the usual definition of (7(M«) = 5). Choosing 71 > skews 
the shape of the progenitor mass functions by boosting the 
ratio of low mass to high mass progenitors. Since a\ > 02, 
setting 71 > boosts the overall merger rate and so needs 
to be compensated for by a the lower value of Go. Allow- 
ing all three parameters to vary, consistently good fits are 
found with Go = 0.57, 71 = 0.38 and 72 = —0.01 with a re- 
duced rms deviation from the MS data of CTcmf = 0.055. Over 
most of the range over which it is employed G(a\ja2; 52/02) 
remains less than but of order unity. The conditional mass 
functions produced by this new set of trees are shown by the 
dashed lines in Fig. 1. We see that this minor change to the 
merger tree algorithm has resulted in merger trees that axe 
in good agreement with the N-body simulation results over 
a wide range of masses and redshifts. The only mass bins 
where the MC and MS conditional mass functions are not 
in good agreement are the bins with Mi > M2. In a truly 
hierarchical model such as is produced by our algorithm Mi 
can never be greater than M2. In the MS data Cole et al. 
(2007) noted that Mi > M2 happens occasionally for low 
mass halos due to the temporaxy, premature linking of EOF 
groups. 

3.2 Main Progenitor Mass E\inctions 

The parameters of our new MC merger tree algorithm were 
tuned to produce good agreement with the conditional mass 
functions plotted in Fig. 1 and so the level of agreement is 
perhaps not surprising. However we can go further and test 
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Figure 2. The mass distributions of the first and second most massive progenitors. The plotted quantities fist and /2nd a^re the 
contributions to the overall conditional mass functions plotted in Fig. 1 provided by the 1st and 2nd most massive progenitors respectively. 
The panels correspond directly to those of Fig. 1 and are labelled by the final halo mass M2 and redshift zi of the progenitors. The 
histograms show the results from the Millennium Simulation with distribution the fist plotted with heavy lines and /2nd with light lines. 
The corresponding predictions of the GALFORM and new Monte-Carlo algorithms are shown by the heavy (/ist) and light (/2nd) dotted 
and dashed curves respectively. The 20 particle mass resolution of the Millennium Simulation is shown by the vertical dotted line, but 
only plays a role for the z = 4 progenitors of the lowest mass, M2 = 10^2 Mq, halos. 
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the success of the algorithm by testing other statistical prop- 
erties of the merger trees. Fig 2 plots the mass functions for 
the first and second most massive progenitors for the same 
selection of redshifts and final masses as in Fig 1. This is an 
interesting test of the merger trees as often in galaxy for- 
mation applications it is the most massive progenitors and 
mergers between them that are most important in determin- 
ing the properties of the galaxies hosted by the halos. It also 
tests an aspect of the merger trees that cannot be predicted 
by EPS theory alone as it involves how often one has a pro- 
genitor of a given mass and not just the mean number of 
such progenitors. 

As noted in Cole et al. (2007) the mass functions for the 
first and second most massive progenitors given by the orig- 
inal GALFORM algorithm do a good job of matching the 
shape and relative positions of the two distributions, but 
systematically underestimate both masses with increasing 
redshift. This is completely remedied in the new algorithm 
which matches the positions and shapes of the N-body dis- 
tributions extremely accurately. 

3.3 Major Mergers 

Another important aspect of the merger trees is the occur- 
rence of major mergers. In galaxy formation models major 
mergers between galaxies, which occur after halo mergers, 
are often deemed to be responsible for initiating bursts of 
star formation and for converting disc galaxies to spheroidal 
systems. Thus it is interesting to see what level of agreement 
our new algorithm has with estimates from the MS. Fig. 3 
compares the redshift distribution of the most recent major 
merger for halos of various final masses. Here a major merger 
has been defined as a merger between two halos where the 
smaller is at least fraction /major ~ 0.3 of the mass of the 
larger. At each redshift we find the most massive progenitor 
of the final halo and record the lowest redshift at which one 
of these progenitors is undergoing a major merger. 

It was found in Cole et al. (2007) that the original GAL- 
FORM algorithm, shown by the dotted line in Fig. 3, signif- 
icantly overestimated the number of recent major mergers. 
Fig. 3 shows that this shortcoming is very largely overcome 
by our new algorithm. The redshift distributions of the most 
recent major mergers match accurately the overall shape of 
those from the MS including their dependence on final halo 
mass. 



3.4 Overall Mass Functions 

The above comparisons to the results of the MS test the 
mass range of the merger trees that is most important for 
galaxy formation applications, but it is also interesting to 
probe whether our new algorithm remains plausible for much 
larger ranges in mass. Sheth & Tormen (1999) (see also 
Jenkins et al. 2001) have shown that for a wide range of 
initial conditions and redshifts that the halo mass function 
has a universal form. A good analytic match to this universal 
form for the fraction of mass in halos of mass M is provided 
by the Sheth & Tormen (2002) mass function 
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Figure 3. The redshift distribution of the most recent major 
mergers of halos with different final masses M2. Here a major 
merger is defined as a merger of the most massive progenitor of the 
final halo with a second halo whose mass is at least /major = 0.3 
times that of the main progenitor. The histogram shows the N- 
body results and the dotted and dashed lines show the results 
from the original GALFORM and new Monte-Carlo algorithm 
respectively. 
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with A = 0.322, a = 0.707, p = 0.3 and the mass dependent 
variable v = 5/a{M). Taking this as a good description of 
the mass distribution of halos at redshift z = one can 
generate a grid of merger trees rooted at 2 = 0, weight 
them by their redshift z — abundance and compute the 
overall abundance of progenitor halos at any earlier redshift. 
If the merger the tree algorithm is in good agreement with 
N-body simulations then these z > mass functions should 
be in good agreement with the Sheth- Tormen mass function 
evaluated at that redshift. 

Fig. 4 compares the Sheth- Tormen mass function with 
those determined with both the original GALFORM and 
new merger trees. In the top panel one sees that the high 
mass exponential cut off to the mass function systematically 
moves to lower v at higher redshift. In other words, as we 
saw with the conditional mass functions in Fig. 1, the char- 
acteristic mass evolves too rapidly in these trees. In contrast 
in the lower panel we see that with our new trees this sys- 
tematic error is greatly reduced and the abundance of high 
mass halos matches the Sheth- Tormen prediction quite ac- 
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Figure 4. The solid curve sliows tlie Slietti-Tormen tialo mass 
function expressed in terms of the variable u = S{z)/cr{M). This 
is compared to mass functions at redshifts z = 0.5,1,2 and 4 
determined by generating grids of merger trees starting at 2 = 
and counting their progenitors at these higher redshifts. The top 
panel shows the results for the original GALFORM merger trees 
and the bottom panel the results for our new merger trees. 

curately over a wide range of redshift. Note that for each 
merger tree mass function the turnover at low masses (low 
v) is due to the specified mass resolution of the trees. At 
higher redshift a fixed mass implies higher v = 5{z)/a{M) 
and so the mass resolution causes deviations at higher and 
higher v. 



4 DISCUSSION 

It is perhaps surprising that an algorithm motivated by EPS 
theory, which is only a function of the smoothed linear the- 
ory overdensity at a point, is able to accurately describe 
the complete merger histories of dark matter haloes in a 
fully non-linear N-body simulation. The EPS theory, as de- 
rived by Bond et al. (1991), makes the following series of 
assumptions none of which can rigorously be true. It as- 
sumes that virialized halos form when the linear theory over- 
density equals the threshold for collapse given by the pure 
spherical collapse model; the linear overdensity at a given 
point in space is assumed to vary with the smoothing scale 
as an uncorrelated (Brownian) random walk (the sharp k- 
space filtering approximation); when assigning mass points 
to halos of mass M no condition is set to require that these 



mass points should lie in spatially localised regions capable 
of forming halos of that mass. 

One might have thought that a more natural start- 
ing point for developing an accurate merger tree algorithm 
would have been the ellipsoidal collapse model of Sheth & 
Tormen (1999); Sheth et al. (2001) as its mass function much 
more accurately matches that of N-body simulations (al- 
though a free parameter, , is adjusted to achieve this fit). 
However, it is not easy to work with this model as there is no 
simple analytic expression for the conditional mass function 
for small timesteps. Furthermore, the results of this more 
complicated model can often be approximated by minor 
modifications of the formulae that are derived using the EPS 
formalism. For example, Giocoli et al. (2007) have shown 
the inserting a factor of ^/q — 0.84 into the EPS formation 
time formula of Lacey & Cole (1993) results in a reasonable 
match to the predictions of the ellipsoidal collapse model. 
Our algorithm was motivated by the EPS formalism, but 
the modification we introduce in equation (6) means that 
its predictions are no longer those of the EPS formalism. 
If instead one were trying to come up with an algorithm 
based on the ellipsoidal collapse model, then the end result 
might well be very similar. In fact, as noted in Section 3.1, 
the y/q = 0.84 factor advocated by Giocoli et al. (2007) is 
equivalent to our Go factor. The other assumptions of the 
EPS theory, listed above, and not addressed in the ellip- 
soidal collapse model must also play a role in determining 
merger histories. By adopting the modification defined in 
equation (6) and fitting directly to N-body simulation re- 
sults, our model is fitting the net effect of all the additional 
physics and not just that due to the shape of the density 
perturbation. 

After we completed this project Neistein & Dekel (2007) 
presented an alternative algorithm to generate dark matter 
halo merger trees based on fitting log-normal distributions 
to progenitor mass functions expressed in scaled mass and 
time variables. Their algorithm, which is of very comparable 
speed to ours, is also tuned to fit the conditional mass func- 
tions of merger trees extracted from the Millennium simu- 
lation. There will be some differences in the results of the 
two algorithms because the Millennium Simulation merger 
trees used by Neistein & Dekel (2007) are not the sim- 
ple friends-of-friends merger trees we constructed for this 
project, but instead the "DHALO" merger trees that were 
constructed by the Durham Group and used in the semi- 
analytic galaxy formation model of Bower et al. (2006). Both 
sets of trees are based on the same catalogues of friends-of- 
friends groups, but the "DHALO" algorithm uses additional 
information concerning substructures identified using SUB- 
FIND (Springel et al. 2001). (There is some discussion of 
the additional criteria useful for galaxy formation calcula- 
tions in Harker et al. (2006).) We opted not to use these 
trees since a criterion that delays the time at which the 
merger is deemed to take place has the side effect of causing 
some halos to loose mass prior to the merger. This artifi- 
cially increases the occurence of progenitor halos that are 
more massive than their descendents and so slightly distorts 
the conditional mass functions. 



^ This parameter is denoted a in Sheth & Tormen (1999) and q 
in Giocoli et al. (2007) 
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5 CONCLUSIONS 

Wc have presented a new Monte-Carlo algorithm to generate 
dark matter halo merger trees. The algorithm is a modifi- 
cation of the Extended Press-Schechter algorithm described 
in Cole et al. (2000). The change we have made to the al- 
gorithm was motivated empirically and tuned to match the 
conditional mass function of halos extracted from the Millen- 
nium Simulation (MS, Springel et al. 2005). We find that not 
only can we get a very accurate match to these conditional 
mass functions over a wide range of mass and redshift, but 
that the other statistical properties of the new trees match 
very well those from the Millennium Simulation. The im- 
provement in accuracy over the algorithm previously used 
in the GALFORM semi-analytic code Cole et al. (2000) is 
very significant and should make the new algorithm a very 
useful tool. 

While our algorithm has been tuned to match the re- 
sults of MS, which is a particular ACDM model, we would 
expect it to a significant improvement over EPS based al- 
gorithms for quite a wide range of CDM-like initial con- 
ditions. The overly rapid evolution in the typical mass of 
progenitor halos was a generic problem with the old algo- 
rithm and the reduced merger rate of the new algorithm 
should be an improvement in all cases. Wc have made a for- 
tran90 implementation of algorithm available at http:/ /star- 
www.dur.ac.uk/~cole/merger_trees . 
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APPENDIX A: THE SPLIT ALGORITHM 

Given a halo of mass M2 at redshift z, the task of the split 
algorithm is to take a small step, Az, to higher redshift 
determine the mass accreted in this interval in unresolved 
halos with masses less than Mres and determine whether or 
not the halo undergoes a binary split. If the halo does split 
then it must determine the masses of the two fragments. In 
the description below we make use of the following notation. 
We denote minus logarithmic slope of the a{M) relation 
by a{M) = — dlntr/dlnM and its values at masses M2, 
M2/2 and Ml = qM2 by 02, Qh and ai(q) respectively. 
Similarly we denote the values of a{AI) at AI2, M2/2, A/res = 
QtcbALz and Mi by (T2, (Th, o"rcs and (Ti{q) respectively. With 
this notation the expression in equation (3) for number of 
fragments produced per unit interval of q produced in a 
redshift step A« can be written as 

= S{q) R{q) Az, (Al) 



dq 
where 



R{q) 



aiiq) V{q) ( (2g)''ai(g) 



Oh 



BqP 



(A2) 
(A3) 
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Viq) = 



[cTi(g)2-cr§]3/2 



(A4) 



and 7] — /3 — 1 — 7i/i. We have written the expression for 
dN/dq in this form so that, as detailed below, we can choose 
the parameters B, (5 and /x such that R{q) < 1 for gres < 
Q < 1/2 and S{q) oc q^~^ is a simple power law. This results 
in several very useful properties. First, 



1/2 



S{q) dq Az 



(A5) 



provides an upper limit on the expected number of re- 
solved fragments split off the main halo in step Az. Wo 
use this to choose the step size by taking Az to be the 
minimum of ei\/2(o"h ~ o^^^'^ Id&jdz and the value given 
by equation (A5) when A/upper = ti (by default, we take 
ei = €2 = 0.1). The first constraint ensures that the ex- 
ponent in equation (1) is small so that the oquatiou (2) is 
correct to first order and the second constraint ensures that 
multiple splittings in one timestep are negligible. 

Having determined Az, the next step is to evaluate F 
from equation (5) to determine fraction of mass that is ac- 
creted in unresolved lialos in this timestep. The expression 
defining F can be simplified to the form 



y n (72 Vo"2/ dz 



(A6) 



where we have made the substitution u = a2/icrf — a|)^^^ 
and the integral 



rn 

Jo 



(l-hl/wT/'c 



(A7) 



with tires = cr2/(cr?es — o"!)^''^- Since this integral has no 
dependence on M2, z or a{M) it can be tabulated as a simple 

look-up table for any chosen value of the parameter 71. In 
the original GALFORM algorithm it reduces to J(wros) = 

tires. 

The next step is to generate the first of three uniform 
random variables in the range to 1. If this first variable, ri, 
is greater than A^uppcr evaluated with the selected Az then 
no split occurs at this timestep and M2 is just reduced to 
M2(l — F) to account for the accreted mass. If ri < iVupper 
we generate a second random variable r2 and transform it 
using q = (gr'es + {2^^ — q^cs)'''2Y^^ so that it is drawn from 
the power- law distribution g''"^ in the range grcs < q < 
1/2. Finally we generate a third random variate rs and only 
accept g if rs < R{q). In the case that q is rejected we again 
simply reduce M2 to M2(l — F), but if q is accepted we 
generate two fragments with masses qM2 and M2(l — F — 
q). This rejection step ensures that q is being drawn with 
the correct normalization from the probability distribution 
defined by equation (Al). 

For this algorithm to work wo require R(q) < 1 for 
gres < q < 1/2. Referring to equation (A3), in all CDM 
models, a{M) > and d{a)/dM > and so the first term 
Qi(g)/Qh is necessarily less than one. Also these conditions 
imply that the function V{q) is monotonically increasing and 
ln(V) versus ln(g) is concave upwards for < g < 1/2. This 
means that V{q) is bounded from above by the power law 
Bq'^ chosen to satisfy Bq^ = V{q) for q = gres and q = 1/2. 
In other words with this choice of B and /? the second term 



in equation (A3), V{q)/Bq^ , is less than or equal to one. 
Finally if we choose 

f Qh if 



H = 



ln(a-res/crh) 



In 2gr, 



if 



71 > 
71 < 



(A8) 



then regardless of the sign of 71 the last factor 
^i22)_£]^~j , is also less than or equal to one and so R{q) 

is always less than one as required. 

The merger tree produced by this algorithm has no 
directly imposed time/redshift resolution and comprises 
of only binary mergers. However, we typically rebin each 
merger tree onto a discrete grid of predefined rcdshift snap- 
shots. With this coarser time resolution the mergers occur- 
ring between snapshots can involve three or even more halos. 



